Accelerated face detector training using the PSL framework
We train a face detection system using the PSL framework [1] which combines the AdaBoost
learning algorithm and Haar-like features. We demonstrate the ability of this framework to
overcome some of the challenges inherent in training classifiers that are structured in cascades
of boosted ensembles (CoBE). The PSL classifiers are compared to Viola-Jones type cascaded
classifiers. We establish the ability of the PSL framework to produce classifiers in a
complex domain in a significantly reduced time frame. They also comprise fewer boosted
ensembles, albeit at the price of increased false detection rates on our test dataset. We also report
on results from a more diverse set of experiments carried out on the PSL framework in
order to shed more light on the effects of variations in its adjustable training parameters.
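As an illustration of the classifier structure discussed above, the following is a minimal sketch of how a generic cascade of boosted ensembles evaluates one sample. It is not the PSL training procedure, which the abstract does not detail. The names `stump`, `stage_score` and `cascade_predict`, the toy feature vectors and all weights and thresholds are assumptions made for the example; a real detector would use Haar-like feature responses in place of the raw values.

```python
from typing import Callable, List, Sequence, Tuple

# A weak classifier maps a feature vector to a vote of +1 or -1.
WeakClassifier = Callable[[Sequence[float]], int]
# A stage is a weighted ensemble of weak classifiers plus a decision threshold.
Stage = Tuple[List[Tuple[float, WeakClassifier]], float]


def stump(index: int, threshold: float) -> WeakClassifier:
    """Hypothetical decision stump on one value (stand-in for a Haar-like feature)."""
    return lambda x: 1 if x[index] > threshold else -1


def stage_score(stage: Stage, x: Sequence[float]) -> float:
    """Weighted vote of one boosted ensemble, as in AdaBoost's final hypothesis."""
    ensemble, _ = stage
    return sum(alpha * h(x) for alpha, h in ensemble)


def cascade_predict(cascade: List[Stage], x: Sequence[float]) -> bool:
    """Accept x only if every stage accepts it; reject at the first failing stage."""
    for stage in cascade:
        if stage_score(stage, x) < stage[1]:
            return False  # early rejection: later stages are never evaluated
    return True


# Toy two-stage cascade with made-up weights and thresholds.
cascade: List[Stage] = [
    ([(1.0, stump(0, 0.5))], 0.0),
    ([(0.7, stump(1, 0.2)), (0.4, stump(2, 0.9))], 0.0),
]
```

Because most windows in an image are rejected by the first stages, a cascade with fewer or cheaper stages evaluates faster, but each stage removed is one less chance to reject a false detection.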
A novel bootstrapping method for positive datasets in cascades of boosted ensembles
We present a novel method for efficiently training a face detector using large positive
datasets in a cascade of boosted ensembles. We extend the successful Viola-Jones [1] framework,
which achieved low false acceptance rates through bootstrapping negative samples, with the
capability to also bootstrap large positive datasets, thereby capturing more in-class variation
of the target object. We achieve this form of bootstrapping by way of an additional embedded
cascade within each layer and term the new structure the Bootstrapped Dual-Cascaded
(BDC) framework. We demonstrate its ability to easily and efficiently train a classifier on
large and complex face datasets which exhibit acute in-class variation.
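The negative bootstrapping that the abstract attributes to the Viola-Jones framework can be sketched as follows: the next layer is trained only on negatives that the cascade built so far still accepts. This sketch covers the negative side only. The embedded cascade that BDC uses for positives is not described in enough detail here to reproduce. The function name `bootstrap_negatives`, the `accepts` callable and the toy data are assumptions for illustration.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]


def bootstrap_negatives(
    accepts: Callable[[Sample], bool],
    pool: List[Sample],
    wanted: int,
) -> List[Sample]:
    """Collect up to `wanted` negatives from `pool` that the current cascade wrongly accepts."""
    hard: List[Sample] = []
    for sample in pool:
        if len(hard) >= wanted:
            break
        if accepts(sample):  # a false positive of the cascade trained so far
            hard.append(sample)
    return hard


# Toy usage: a stand-in "cascade" that accepts anything whose first value exceeds 0.5.
current_cascade = lambda x: x[0] > 0.5
negative_pool = [[0.1], [0.7], [0.9], [0.3], [0.6]]
hard_negatives = bootstrap_negatives(current_cascade, negative_pool, 2)
```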
A reconfigurable hybrid intelligent system for robot navigation
Soft computing has come of age to offer us a wide array of powerful and efficient algorithms
that independently matured and influenced our approach to solving problems in robotics,
search and optimisation. The steady progress of technology, however, induced a flux of new
real-world applications that demand more robust and adaptive computational paradigms,
tailored specifically for the problem domain. This gave rise to hybrid intelligent systems;
to name a few of the successful ones, we have the integration of fuzzy logic, genetic algorithms
and neural networks. As noted in the literature, they are significantly more powerful than
individual algorithms, and have therefore been the subject of research activities in the past
decades. There are problems, however, that have not succumbed to traditional hybridisation
approaches, pushing the limits of current intelligent systems design and calling into question
whether their solutions can guarantee optimality, real-time execution and self-calibration.
This work presents an
improved hybrid solution to the problem of integrated dynamic target pursuit and obstacle
avoidance, comprising a cascade of fuzzy logic systems, a genetic algorithm, the A* search
algorithm and the Voronoi diagram generation algorithm.
Face tracking using a hyperbolic catadioptric omnidirectional system
In the first part of this paper, we present a brief review of catadioptric omnidirectional
systems. The special case of the hyperbolic omnidirectional system is analysed in depth.
The literature shows that a hyperboloidal mirror has two clear advantages over alternative
geometries. Firstly, a hyperboloidal mirror has a single projection centre [1]. Secondly, the
image resolution is uniformly distributed along the mirror’s radius [2].
In the second part of this paper we show empirical results for the detection and tracking
of faces from the omnidirectional images using the Viola-Jones method. Both panoramic and
perspective projections, extracted from the omnidirectional image, were used for that purpose.
The omnidirectional image size was 480x480 pixels, in greyscale. The tracking method used
regions of interest (ROIs) set as the result of the detections of faces from a panoramic projection
of the image. In order to avoid losing or duplicating detections, the panoramic projection was
extended horizontally. Duplications were eliminated based on the ROIs established by previous
detections. After a confirmed detection, faces were tracked from perspective projections (which
are called virtual cameras), each one associated with a particular face. The zoom, pan and tilt
of each virtual camera were determined by the ROIs previously computed on the panoramic
image.
The results show that, when a careful combination of the two projections is used, good frame
rates can be achieved in the task of tracking faces reliably.
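The horizontally extended panoramic projection described above can be sketched as a polar unwrap of the omnidirectional image in which the columns cover more than 360 degrees, so that a face straddling the seam appears whole at least once. This is a minimal nearest-neighbour sketch, not the paper's implementation: the image centre, the inner and outer radii, the output size, the 90 degree overlap and the linear radius-to-row mapping are all assumptions, and the hyperbolic mirror geometry is not modelled. Only the 480x480 greyscale input size comes from the abstract.

```python
import math
from typing import List

Image = List[List[int]]  # greyscale image as rows of pixel values


def unwrap_panorama(
    omni: Image,
    cx: float,
    cy: float,
    r_min: float,
    r_max: float,
    width: int,
    height: int,
    overlap_deg: float = 0.0,
) -> Image:
    """Nearest-neighbour polar unwrap; columns span 360 + overlap_deg degrees."""
    rows, cols = len(omni), len(omni[0])
    total_deg = 360.0 + overlap_deg
    pano = [[0] * width for _ in range(height)]
    for v in range(height):
        # Row 0 samples the outer radius, the last row the inner radius.
        r = r_max - (r_max - r_min) * v / max(height - 1, 1)
        for u in range(width):
            theta = math.radians(total_deg * u / width)
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            if 0 <= x < cols and 0 <= y < rows:
                pano[v][u] = omni[y][x]
    return pano


# Synthetic 480x480 test image; one output column per degree, 90 degrees of overlap.
omni = [[(x + 2 * y) % 256 for x in range(480)] for y in range(480)]
pano = unwrap_panorama(omni, 240, 240, 50, 200, width=450, height=60, overlap_deg=90)
```

With this layout, columns 360 to 449 repeat the start of the panorama, which is why detections in the extended strip have to be de-duplicated against the ROIs of earlier detections.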
A new 2D static hand gesture colour image dataset for ASL gestures
It usually takes a fusion of image processing and machine learning algorithms in order to
build a fully-functioning computer vision system for hand gesture recognition. Fortunately,
the complexity of developing such a system could be alleviated by treating the system as a
collection of multiple sub-systems working together, in such a way that they can be dealt
with in isolation. Machine learning algorithms need to feed on thousands of exemplars (e.g. images,
features) to automatically establish some recognisable patterns for all possible classes (e.g.
hand gestures) that apply to the problem domain. A good number of exemplars helps, but
it is also important to note that the efficacy of these exemplars depends on the variability
of illumination conditions, hand postures, angles of rotation, scaling and on the number of
volunteers from whom the hand gesture images were taken. These exemplars are usually
subjected to image processing first, to reduce the presence of noise and extract the important
features from the images. These features serve as inputs to the machine learning system.
Different sub-systems are integrated together to form a complete computer vision system for
gesture recognition. The main contribution of this work is the production of the exemplars.
We discuss how a dataset of standard American Sign Language (ASL) hand gestures, containing
2425 images from 5 individuals with variations in lighting conditions and hand postures, is
generated with the aid of image processing techniques. A minor contribution is given in
the form of a specific feature extraction method called moment invariants, for which the
computation method and the values are furnished with the dataset.
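To make the idea of moment invariants concrete, the sketch below computes the first two of Hu's classical invariants from normalised central moments, \phi_1 = \eta_{20} + \eta_{02} and \phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2. The abstract does not say which invariants the dataset provides, so the choice of Hu's set is an assumption, as are the function name `hu_first_two` and the toy binary shape.

```python
from typing import List, Tuple

Image = List[List[float]]  # greyscale or binary image as rows of pixel values


def hu_first_two(img: Image) -> Tuple[float, float]:
    """Return Hu's first two moment invariants (phi1, phi2) of an image."""
    m00 = float(sum(sum(row) for row in img))
    if m00 == 0:
        raise ValueError("image has no mass; moments are undefined")
    # Intensity-weighted centroid.
    xbar = sum(x * v for row in img for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(img) for v in row) / m00

    def eta(p: int, q: int) -> float:
        """Normalised central moment: mu_pq / mu_00 ** (1 + (p + q) / 2)."""
        mu = sum(
            (x - xbar) ** p * (y - ybar) ** q * v
            for y, row in enumerate(img)
            for x, v in enumerate(row)
        )
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


# Toy L-shaped blob, a translated copy, and a copy rotated by 90 degrees.
shape = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
shifted = [[0] * 6] + [[0, 0] + row[:4] for row in shape[:5]]
rotated = [list(r) for r in zip(*shape[::-1])]
```

The three images give the same pair of values up to floating-point error, which is the property that makes such features useful when the same gesture appears at different positions and orientations.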